


Search for: All records

Creators/Authors contains: "Wang, Ran"


  1. Abstract Objective. This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior work can handle only electrodes on a 2D grid (i.e., an electrocorticographic (ECoG) array) and data from a single patient. We aim to design a deep-learning model architecture that can accommodate both surface ECoG and depth (stereotactic EEG or sEEG) electrodes. The architecture should allow training on data from multiple participants with large variability in electrode placements. The model should not have subject-specific layers, and the trained model should perform well on participants unseen during training. Approach. We propose a novel transformer-based model architecture named SwinTW that can work with arbitrarily positioned electrodes by leveraging their 3D locations on the cortex rather than their positions on a 2D grid. We train subject-specific models using data from a single participant and multi-subject models exploiting data from multiple participants. Main results. The subject-specific models using only low-density 8 × 8 ECoG data achieved a high decoding Pearson correlation coefficient with the ground-truth spectrogram (PCC = 0.817) over N = 43 participants, significantly outperforming our prior convolutional ResNet model and the 3D Swin transformer model. Incorporating additional strip, depth, and grid electrodes available in each participant (N = 39) led to further improvement (PCC = 0.838). For participants with only sEEG electrodes (N = 9), subject-specific models still achieved comparable performance, with an average PCC = 0.798. A single multi-subject model trained on ECoG data from 15 participants yielded results (PCC = 0.837) comparable to those of 15 models trained individually for these participants (PCC = 0.831). 
Furthermore, the multi-subject models achieved high performance on unseen participants, with an average PCC = 0.765 in leave-one-out cross-validation. Significance. The proposed SwinTW decoder enables future speech decoding approaches to utilize any electrode placement that is clinically optimal or feasible for a particular participant, including using only depth electrodes, which are more routinely implanted in chronic neurosurgical procedures. The success of the single multi-subject model when tested on participants within the training cohort demonstrates that the model architecture is capable of exploiting data from multiple participants with diverse electrode placements. The architecture’s flexibility in training with both single-subject and multi-subject data, as well as grid and non-grid electrodes, ensures its broad applicability. Importantly, the generalizability of the multi-subject models in our study population suggests that a model trained using paired acoustic and neural data from multiple patients can potentially be applied to new patients with speech disability for whom collecting paired acoustic-neural training data is not feasible. 
  2. When we vocalize, our brain distinguishes self-generated sounds from external ones. A corollary discharge signal supports this function in animals; however, in humans, its exact origin and temporal dynamics remain unknown. We report electrocorticographic recordings in neurosurgical patients and a connectivity analysis framework based on Granger causality that reveals major neural communications. We find a reproducible source for corollary discharge across multiple speech production paradigms, localized to the ventral speech motor cortex before speech articulation. The uncovered discharge predicts the degree of auditory cortex suppression during speech, its well-documented consequence. These results reveal the human corollary discharge source and timing, with far-reaching implications for speech motor control as well as auditory hallucinations in human psychosis. 
  3. Abstract Located at northern latitudes and subject to large seasonal temperature fluctuations, boreal forests are sensitive to the changing climate, with evidence for both increasing and decreasing productivity, depending upon conditions. Optical remote sensing of vegetation indices based on spectral reflectance offers a means of monitoring vegetation photosynthetic activity and provides a powerful tool for observing how boreal forests respond to changing environmental conditions. Reflectance‐based remotely sensed optical signals in northern‐latitude or high‐altitude regions are readily confounded by snow coverage, hampering applications of satellite‐based vegetation indices in tracking vegetation productivity at large scales. Unraveling the effects of snow from satellite data can be challenging, particularly when validation data are lacking. In this study, we established an experimental system in Alberta, Canada, including six boreal tree species, both evergreen and deciduous, to evaluate the confounding effects of snow on three vegetation indices: the normalized difference vegetation index (NDVI), the photochemical reflectance index (PRI), and the chlorophyll/carotenoid index (CCI), all used in tracking vegetation productivity for boreal forests. Our results revealed substantial impacts of snow on canopy reflectance and vegetation indices, expressed as increased albedo, decreased NDVI values, and increased PRI and CCI values. These effects varied among species and functional groups (evergreen and deciduous), and different vegetation indices were affected differently, indicating contradictory, confounding effects of snow on these indices. In addition to snow effects, we evaluated the contribution of deciduous trees to vegetation indices in mixed stands of evergreen and deciduous species, which contribute to the observed relationship between greenness‐based indices and ecosystem productivity of many evergreen‐dominated forests that contain a deciduous component. 
Our results demonstrate confounding and interacting effects of snow and vegetation type on vegetation indices and illustrate the importance of explicitly considering snow effects in any global‐scale photosynthesis monitoring efforts using remotely sensed vegetation indices. 
  4. Decoding human speech from neural signals is essential for brain–computer interface (BCI) technologies that aim to restore speech in populations with neurological deficits. However, it remains a highly challenging task, compounded by the scarce availability of neural signals with corresponding speech, data complexity and high dimensionality. Here we present a novel deep learning-based neural speech decoding framework that includes an ECoG decoder that translates electrocorticographic (ECoG) signals from the cortex into interpretable speech parameters and a novel differentiable speech synthesizer that maps speech parameters to spectrograms. We have developed a companion speech-to-speech auto-encoder consisting of a speech encoder and the same speech synthesizer to generate reference speech parameters to facilitate the ECoG decoder training. This framework generates natural-sounding speech and is highly reproducible across a cohort of 48 participants. Our experimental results show that our models can decode speech with high correlation, even when limited to only causal operations, which is necessary for adoption by real-time neural prostheses. Finally, we successfully decode speech in participants with either left or right hemisphere coverage, which could lead to speech prostheses in patients with deficits resulting from left hemisphere damage. 
  5. This study investigates speech decoding from neural signals captured by intracranial electrodes. Most prior work can handle only electrodes on a 2D grid (i.e., an electrocorticographic or ECoG array) and data from a single patient. We aim to design a deep-learning model architecture that can accommodate both surface (ECoG) and depth (stereotactic EEG or sEEG) electrodes. The architecture should allow training on data from multiple participants with large variability in electrode placements, and the trained model should perform well on participants unseen during training. Approach. We propose a novel transformer-based model architecture named SwinTW that can work with arbitrarily positioned electrodes by leveraging their 3D locations on the cortex rather than their positions on a 2D grid. We train both subject-specific models using data from a single participant and multi-patient models exploiting data from multiple participants. Main Results. The subject-specific models using only low-density 8 × 8 ECoG data achieved a high decoding Pearson correlation coefficient with the ground-truth spectrogram (PCC = 0.817) over N = 43 participants, outperforming our prior convolutional ResNet model and the 3D Swin transformer model. Incorporating additional strip, depth, and grid electrodes available in each participant (N = 39) led to further improvement (PCC = 0.838). For participants with only sEEG electrodes (N = 9), subject-specific models still achieved comparable performance, with an average PCC = 0.798. The multi-subject models achieved high performance on unseen participants, with an average PCC = 0.765 in leave-one-out cross-validation. Significance. The proposed SwinTW decoder enables future speech neuroprostheses to utilize any electrode placement that is clinically optimal or feasible for a particular participant, including using only depth electrodes, which are more routinely implanted in chronic neurosurgical procedures. 
Importantly, the generalizability of the multi-patient models suggests the exciting possibility of developing speech neuroprostheses for people with speech disability without relying on their own neural data for training, which is not always feasible. 
  6. Speech production is a complex human function requiring continuous feedforward commands together with reafferent feedback processing. These processes are carried out by distinct frontal and temporal cortical networks, but the degree and timing of their recruitment and dynamics remain poorly understood. We present a deep learning architecture that translates neural signals recorded directly from the cortex to an interpretable representational space that can reconstruct speech. We leverage learned decoding networks to disentangle feedforward vs. feedback processing. Unlike prevailing models, we find a mixed cortical architecture in which frontal and temporal networks each process both feedforward and feedback information in tandem. We elucidate the timing of feedforward and feedback-related processing by quantifying the derived receptive fields. Our approach provides evidence for a surprisingly mixed cortical architecture of speech circuitry together with decoding advances that have important implications for neural prosthetics. 
  7. Abstract The [C II] 158 μm emission line and the underlying far-infrared (FIR) dust continuum are important tracers for studying star formation and kinematic properties of early galaxies. We present a survey of the [C II] emission lines and FIR continua of 31 luminous quasars at $z > 6.5$ using the Atacama Large Millimeter Array (ALMA) and the NOrthern Extended Millimeter Array at sub-arcsec resolution. This survey more than doubles the number of quasars with [C II] and FIR observations at these redshifts and enables statistical studies of quasar host galaxies deep into the epoch of reionization. We detect [C II] emission in 27 quasar hosts with a luminosity range of $L_{\mathrm{[C\,II]}} = (0.3\text{--}5.5) \times 10^{9}\,L_{\odot}$ and detect the FIR continuum of 28 quasar hosts with a luminosity range of $L_{\mathrm{FIR}} = (0.5\text{--}13.0) \times 10^{12}\,L_{\odot}$. Both $L_{\mathrm{[C\,II]}}$ and $L_{\mathrm{FIR}}$ are correlated ($\rho \simeq 0.4$) with the quasar bolometric luminosity, albeit with substantial scatter. The quasar hosts detected by ALMA are clearly resolved with a median diameter of ∼5 kpc. About 40% of the quasar host galaxies show a velocity gradient in [C II] emission, while the rest show either dispersion-dominated or disturbed kinematics. Basic estimates of the dynamical masses of the rotation-dominated host galaxies yield $M_{\mathrm{dyn}} = (0.1\text{--}7.5) \times 10^{11}\,M_{\odot}$. Considering our findings alongside those of literature studies, we found that the ratio between $M_{\mathrm{BH}}$ and $M_{\mathrm{dyn}}$ is about 10 times higher than that of the local $M_{\mathrm{BH}}$–$M_{\mathrm{dyn}}$ relation on average but with substantial scatter (the ratio difference ranging from ∼0.6 to 60) and large uncertainties. 
  8. Abstract Dual active galactic nuclei (AGNs), which are the manifestation of two actively accreting supermassive black holes (SMBHs) hosted by a pair of merging galaxies, are a unique laboratory for studying the physics of SMBH feeding and feedback during an indispensable stage of galaxy evolution. In this work, we present NOEMA CO(2–1) observations of seven kiloparsec-scale dual-AGN candidates drawn from a recent Chandra survey of low-redshift, optically classified AGN pairs. These systems are selected because they show unexpectedly low 2–10 keV X-ray luminosities for their small physical separations signifying an intermediate-to-late stage of merger. Circumnuclear molecular gas traced by the CO(2–1) emission is significantly detected in six of the seven pairs and 10 of the 14 nuclei, with an estimated mass ranging between $(0.2\text{--}21) \times 10^{9}\,M_{\odot}$. The primary nuclei, i.e., the ones with the higher stellar velocity dispersion, tend to have a higher molecular gas mass than the secondary. Most CO-detected nuclei show a compact morphology, with a velocity field consistent with a kiloparsec-scale rotating structure. The inferred hydrogen column densities range between $5 \times 10^{21}$ and $2 \times 10^{23}\ \mathrm{cm^{-2}}$, but mostly at a few times $10^{22}\ \mathrm{cm^{-2}}$, in broad agreement with those derived from X-ray spectral analysis. Together with the relatively weak mid-infrared emission, the moderate column density argues against the prevalence of heavily obscured, intrinsically luminous AGNs in these seven systems, but favors a feedback scenario in which AGN activity triggered by a recent pericentric passage of the galaxy pair can expel circumnuclear gas and suppress further SMBH accretion. 
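
Entries 1 and 5 above report decoding quality as a Pearson correlation coefficient (PCC) between decoded and ground-truth spectrograms, and evaluate generalization to unseen participants with leave-one-out cross-validation. The following is a minimal sketch of how such a metric and split could be computed; it is not the authors' evaluation code. The function names `spectrogram_pcc` and `leave_one_out_splits` are hypothetical, and correlating over all time-frequency bins at once is an assumption (the papers may instead compute the correlation per trial or per frequency band before averaging).

```python
import numpy as np


def spectrogram_pcc(decoded, reference):
    """Pearson correlation between two equally shaped spectrograms.

    Assumption: every time-frequency bin is treated as one sample.
    """
    x = np.asarray(decoded, dtype=float)
    y = np.asarray(reference, dtype=float)
    if x.shape != y.shape:
        raise ValueError("spectrograms must have the same shape")
    # Center both signals, then normalize the covariance by the two norms.
    x = x.ravel() - x.mean()
    y = y.ravel() - y.mean()
    denom = np.sqrt((x ** 2).sum() * (y ** 2).sum())
    if denom == 0:
        raise ValueError("PCC is undefined for a constant spectrogram")
    return float((x * y).sum() / denom)


def leave_one_out_splits(participant_ids):
    """Yield (training_ids, held_out_id), holding out each participant once."""
    ids = list(participant_ids)
    for i, held_out in enumerate(ids):
        yield ids[:i] + ids[i + 1:], held_out
```

In a multi-subject setting, one would train on each `training_ids` list, score the held-out participant with `spectrogram_pcc`, and average the resulting scores across participants.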